Cohen’s Kappa (cohen_kappa_score)#

Cohen’s kappa (\(\kappa\)) measures agreement beyond chance between two labelers. In ML, it’s commonly used to evaluate a classifier against ground truth, especially under class imbalance.

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]
  • \(p_o\) = observed agreement (the proportion of samples on the diagonal of the confusion matrix; for classifier-vs-truth it equals accuracy)

  • \(p_e\) = expected agreement under independent marginals (chance agreement given the label prevalences)

Range: \(\kappa\in[-1, 1]\) (1 = perfect agreement, 0 = chance-level, <0 = worse than chance).
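To make the definition concrete, a quick check with scikit-learn, using toy labels made up for illustration:

```python
from sklearn.metrics import cohen_kappa_score

# Two hypothetical labelers who agree on 7 of 8 items
a = [1, 1, 0, 0, 1, 0, 1, 1]
b = [1, 1, 0, 0, 0, 0, 1, 1]
print(cohen_kappa_score(a, b))  # ≈ 0.75: p_o = 0.875, p_e = 0.5

# Range endpoints
y = [0, 1, 2, 0, 1, 2]
print(cohen_kappa_score(y, y))                   # 1.0 (perfect agreement)
print(cohen_kappa_score(y, [2, 0, 1, 2, 0, 1]))  # ≈ -0.5 (worse than chance)
```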

Goals#

  • Build intuition for why \(\kappa\) differs from accuracy

  • Derive \(\kappa\) from the confusion matrix (binary + multiclass)

  • Implement \(\kappa\) from scratch in NumPy (including weighted \(\kappa\) for ordinal labels)

  • Use \(\kappa\) in a simple optimization loop (threshold selection for a logistic regression model)

Prerequisites#

  • Confusion matrix basics

  • Basic probability (marginals, independence)

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from plotly.subplots import make_subplots

from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=4, suppress=True)

rng = np.random.default_rng(42)

1) Why \(\kappa\) (and not just accuracy)?#

Accuracy counts raw agreement. If the data are imbalanced, a trivial classifier can look “good” by mostly predicting the majority class.

\(\kappa\) adjusts for the agreement you would expect by chance, given the label frequencies.

# A trivial “majority class” classifier
y_true = np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)]
y_pred = np.zeros_like(y_true)

accuracy = (y_true == y_pred).mean()
kappa = cohen_kappa_score(y_true, y_pred)

print(f"accuracy = {accuracy:.3f}")
print(f"kappa    = {kappa:.3f}")
accuracy = 0.950
kappa    = 0.000
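We can reproduce that kappa of zero by hand. The all-zero predictor has predicted prevalences \((1, 0)\), so chance agreement equals the majority-class prevalence and the numerator vanishes:

```python
# Hand computation for the example above (950 zeros, 50 ones, all-zero predictor)
p_o = 0.95                       # observed agreement = accuracy
p_e = 0.95 * 1.0 + 0.05 * 0.0    # expected agreement from the marginals
kappa_manual = (p_o - p_e) / (1 - p_e)
print(kappa_manual)  # 0.0
```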

2) The math (confusion matrix → \(\kappa\))#

Let there be \(K\) classes and a confusion matrix \(C\in\mathbb{N}^{K\times K}\) where \(C_{ij}\) counts samples with true class \(i\) predicted as class \(j\). Let \(N = \sum_{i,j} C_{ij}\).

Observed agreement (diagonal mass):

\[ p_o = \frac{1}{N}\sum_{i=1}^K C_{ii} \]

Define the marginal class frequencies (as proportions):

\[ r_i = \frac{1}{N}\sum_{j=1}^K C_{ij} \quad (\text{true prevalence}), \qquad c_i = \frac{1}{N}\sum_{j=1}^K C_{ji} \quad (\text{predicted prevalence}). \]

If true and predicted labels were independent but kept the same marginals, the expected agreement is:

\[ p_e = \sum_{i=1}^K r_i\,c_i \]

Finally:

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

Notes:

  • \(\kappa=1\) when \(p_o=1\).

  • \(\kappa=0\) when \(p_o=p_e\) (no better than chance under the marginals).

  • \(\kappa<0\) when agreement is worse than chance.

def confusion_matrix_numpy(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    labels: np.ndarray | None = None,
) -> tuple[np.ndarray, np.ndarray]:
    """Confusion matrix C where C[i, j] counts true=labels[i], pred=labels[j]."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape.")

    if labels is None:
        labels = np.unique(np.concatenate([y_true, y_pred]))
    labels = np.asarray(labels)

    label_to_index = {label: i for i, label in enumerate(labels.tolist())}
    true_idx = np.array([label_to_index[label] for label in y_true], dtype=int)
    pred_idx = np.array([label_to_index[label] for label in y_pred], dtype=int)

    k = labels.size
    cm = np.zeros((k, k), dtype=int)
    np.add.at(cm, (true_idx, pred_idx), 1)
    return cm, labels


def kappa_components_from_cm(cm: np.ndarray) -> tuple[float, float]:
    """Return (p_o, p_e) computed from a confusion matrix."""
    cm = np.asarray(cm)
    n = cm.sum()
    if n == 0:
        raise ValueError("Confusion matrix has zero total count.")

    p_o = float(np.trace(cm) / n)
    row = cm.sum(axis=1) / n
    col = cm.sum(axis=0) / n
    p_e = float(row @ col)
    return p_o, p_e


def kappa_weight_matrix(n_classes: int, weights: str | np.ndarray | None) -> np.ndarray:
    """Disagreement weights W (0 on diagonal)."""
    if n_classes < 1:
        raise ValueError("n_classes must be >= 1")

    if weights is None:
        w = np.ones((n_classes, n_classes), dtype=float)
        np.fill_diagonal(w, 0.0)
        return w

    if isinstance(weights, str):
        if weights not in {"linear", "quadratic"}:
            raise ValueError("weights must be None, 'linear', 'quadratic', or a (K,K) array.")

        if n_classes == 1:
            return np.zeros((1, 1), dtype=float)

        idx = np.arange(n_classes)
        i, j = np.meshgrid(idx, idx, indexing="ij")
        if weights == "linear":
            return (np.abs(i - j) / (n_classes - 1)).astype(float)
        return (((i - j) ** 2) / ((n_classes - 1) ** 2)).astype(float)

    w = np.asarray(weights, dtype=float)
    if w.shape != (n_classes, n_classes):
        raise ValueError(f"custom weights must have shape ({n_classes},{n_classes})")
    return w


def cohen_kappa_score_numpy(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    labels: np.ndarray | None = None,
    weights: str | np.ndarray | None = None,
) -> float:
    """Cohen's kappa from scratch (unweighted or weighted)."""
    cm, labels = confusion_matrix_numpy(y_true, y_pred, labels=labels)
    n = cm.sum()
    if n == 0:
        raise ValueError("No samples.")

    o = cm / n
    row = o.sum(axis=1)
    col = o.sum(axis=0)
    e = np.outer(row, col)

    w = kappa_weight_matrix(labels.size, weights)
    denom = float((w * e).sum())
    num = float((w * o).sum())

    # Degenerate case: expected disagreement is zero (perfectly concentrated marginals)
    if np.isclose(denom, 0.0):
        return 1.0

    return 1.0 - (num / denom)
# Sanity check: match scikit-learn
y_true = rng.integers(0, 4, size=1_000)
y_pred = rng.integers(0, 4, size=1_000)

k_np = cohen_kappa_score_numpy(y_true, y_pred)
k_sk = cohen_kappa_score(y_true, y_pred)
print("unweighted", k_np, k_sk)
assert np.isclose(k_np, k_sk)

k_np_lin = cohen_kappa_score_numpy(y_true, y_pred, weights="linear")
k_sk_lin = cohen_kappa_score(y_true, y_pred, weights="linear")
print("linear   ", k_np_lin, k_sk_lin)
assert np.isclose(k_np_lin, k_sk_lin)

k_np_quad = cohen_kappa_score_numpy(y_true, y_pred, weights="quadratic")
k_sk_quad = cohen_kappa_score(y_true, y_pred, weights="quadratic")
print("quadratic", k_np_quad, k_sk_quad)
assert np.isclose(k_np_quad, k_sk_quad)
unweighted -0.006566010065659933 -0.006566010065659933
linear    -0.02898015572377166 -0.02898015572377166
quadratic -0.05021724810620465 -0.050217248106204426

3) Visual intuition: observed vs expected agreement#

\(\kappa\) depends on both:

  • how much mass is on the diagonal (observed agreement)

  • how concentrated the marginals are (chance agreement)

Let’s look at a small multiclass example.

def plot_confusion_matrix(cm: np.ndarray, labels: np.ndarray, title: str) -> go.Figure:
    fig = px.imshow(
        cm,
        x=[str(l) for l in labels],
        y=[str(l) for l in labels],
        text_auto=True,
        color_continuous_scale="Blues",
        title=title,
        labels={"x": "pred", "y": "true", "color": "count"},
    )
    fig.update_yaxes(autorange="reversed")
    fig.update_layout(coloraxis_showscale=False)
    return fig


def plot_true_vs_pred_marginals(cm: np.ndarray, labels: np.ndarray, title: str) -> go.Figure:
    n = cm.sum()
    true_marg = cm.sum(axis=1) / n
    pred_marg = cm.sum(axis=0) / n

    fig = go.Figure()
    fig.add_trace(go.Bar(x=[str(l) for l in labels], y=true_marg, name="true", opacity=0.8))
    fig.add_trace(go.Bar(x=[str(l) for l in labels], y=pred_marg, name="pred", opacity=0.8))
    fig.update_layout(
        barmode="group",
        title=title,
        xaxis_title="class",
        yaxis_title="proportion",
    )
    fig.update_yaxes(range=[0, 1])
    return fig
labels = np.array([0, 1, 2])
y_true = np.r_[np.zeros(200, dtype=int), np.ones(150, dtype=int), np.full(50, 2, dtype=int)]

y_pred = y_true.copy()
flip_idx = rng.choice(y_true.size, size=80, replace=False)
y_pred[flip_idx] = (y_true[flip_idx] + rng.integers(1, 3, size=flip_idx.size)) % 3

cm, labels_used = confusion_matrix_numpy(y_true, y_pred, labels=labels)
p_o, p_e = kappa_components_from_cm(cm)
kappa = cohen_kappa_score_numpy(y_true, y_pred, labels=labels)

print(f"p_o (observed) = {p_o:.3f}")
print(f"p_e (expected) = {p_e:.3f}")
print(f"kappa          = {kappa:.3f}")

plot_confusion_matrix(cm, labels_used, title="Confusion matrix (example)")
p_o (observed) = 0.800
p_e (expected) = 0.390
kappa          = 0.672
plot_true_vs_pred_marginals(cm, labels_used, title="True vs predicted marginals")

4) How prevalence changes \(\kappa\)#

Even if two classifiers have the same accuracy, \(\kappa\) can differ depending on class prevalence because \(p_e\) changes with the marginals.

Below we simulate a binary classifier with a fixed error rate: we flip the true label with probability \(q\). Accuracy stays at \(1-q\), but \(\kappa\) changes as the positive class prevalence moves toward 0 or 1.

def simulate_symmetric_noise(prevalences: np.ndarray, q: float, n: int = 50_000) -> tuple[np.ndarray, np.ndarray]:
    accs: list[float] = []
    kappas: list[float] = []
    for pi in prevalences:
        y = (rng.random(n) < pi).astype(int)
        flip = rng.random(n) < q
        y_hat = y.copy()
        y_hat[flip] = 1 - y_hat[flip]
        accs.append(float((y == y_hat).mean()))
        kappas.append(cohen_kappa_score_numpy(y, y_hat))
    return np.array(accs), np.array(kappas)


prevalences = np.linspace(0.02, 0.98, 60)
q = 0.10
accs, kappas = simulate_symmetric_noise(prevalences, q=q)

fig = go.Figure()
fig.add_trace(go.Scatter(x=prevalences, y=accs, mode="lines", name="accuracy"))
fig.add_trace(go.Scatter(x=prevalences, y=kappas, mode="lines", name="kappa"))
fig.update_layout(
    title=f"Fixed error rate q={q:.2f}: accuracy vs kappa across prevalence",
    xaxis_title="positive class prevalence π = P(y=1)",
    yaxis_title="metric value",
)
fig.update_yaxes(range=[-0.1, 1.0])
fig
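Under these simulation assumptions (symmetric label noise with flip probability \(q\)), \(\kappa\) also has a closed form, which makes a useful cross-check on the simulated curve:

```python
def kappa_symmetric_noise(pi: float, q: float) -> float:
    # Closed form: y ~ Bernoulli(pi), prediction flips y with probability q.
    p_o = 1.0 - q                          # agreement = accuracy
    c1 = pi * (1.0 - q) + (1.0 - pi) * q   # predicted positive prevalence
    p_e = pi * c1 + (1.0 - pi) * (1.0 - c1)
    return (p_o - p_e) / (1.0 - p_e)

print(kappa_symmetric_noise(0.50, 0.10))  # ≈ 0.8 at the balanced case
print(kappa_symmetric_noise(0.05, 0.10))  # much lower at rare prevalence
```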
# Majority-class predictor: accuracy can be high, but kappa stays ~0
prevalences = np.linspace(0.02, 0.98, 60)
n = 50_000

accs: list[float] = []
kappas: list[float] = []
for pi in prevalences:
    y = (rng.random(n) < pi).astype(int)
    y_hat = np.zeros_like(y)
    accs.append(float((y == y_hat).mean()))
    kappas.append(cohen_kappa_score_numpy(y, y_hat))

fig = go.Figure()
fig.add_trace(go.Scatter(x=prevalences, y=accs, mode="lines", name="accuracy"))
fig.add_trace(go.Scatter(x=prevalences, y=kappas, mode="lines", name="kappa"))
fig.update_layout(
    title="Always-predict-0 baseline across prevalence",
    xaxis_title="positive class prevalence π = P(y=1)",
    yaxis_title="metric value",
)
fig.update_yaxes(range=[-0.1, 1.0])
fig

5) Weighted \(\kappa\) (ordinal labels)#

If classes are ordered (e.g., 1–5 star ratings), “off by 1” is usually less severe than “off by 4”. Weighted kappa replaces the exact-match criterion with a graded disagreement penalty.

Let \(O_{ij}=C_{ij}/N\) be the observed joint distribution and \(E_{ij}=r_i c_j\) the expected joint distribution. Choose disagreement weights \(w_{ij}\) with \(w_{ii}=0\).

\[ \kappa_w = 1 - \frac{\sum_{i,j} w_{ij} O_{ij}}{\sum_{i,j} w_{ij} E_{ij}} \]

Common choices:

  • linear: \(w_{ij}=\frac{|i-j|}{K-1}\)

  • quadratic: \(w_{ij}=\frac{(i-j)^2}{(K-1)^2}\)

These only make sense when the class order is meaningful.
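A small worked example, using a made-up 3-class ordinal confusion matrix, traces the weighted formula end to end:

```python
import numpy as np

# Hypothetical ordinal confusion matrix (rows = true, cols = pred)
C = np.array([[10, 2, 0],
              [1, 12, 3],
              [0, 2, 10]])
N = C.sum()
O = C / N                                    # observed joint distribution
E = np.outer(O.sum(axis=1), O.sum(axis=0))   # expected joint under independence

K = C.shape[0]
i, j = np.indices((K, K))
W = (i - j) ** 2 / (K - 1) ** 2              # quadratic disagreement weights

kappa_w = 1.0 - (W * O).sum() / (W * E).sum()
print(round(kappa_w, 3))  # 0.833
```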

def plot_weight_matrix(w: np.ndarray, labels: np.ndarray, title: str) -> go.Figure:
    fig = px.imshow(
        w,
        x=[str(l) for l in labels],
        y=[str(l) for l in labels],
        text_auto=".2f",
        color_continuous_scale="Reds",
        title=title,
        labels={"x": "pred", "y": "true", "color": "disagreement"},
    )
    fig.update_yaxes(autorange="reversed")
    return fig


k = 5
labels = np.arange(k)
y_true = rng.integers(0, k, size=2_000)
noise = rng.choice([-2, -1, 0, 1, 2], size=y_true.size, p=[0.05, 0.15, 0.60, 0.15, 0.05])
y_pred = np.clip(y_true + noise, 0, k - 1)

k_unw = cohen_kappa_score_numpy(y_true, y_pred)
k_lin = cohen_kappa_score_numpy(y_true, y_pred, weights="linear")
k_quad = cohen_kappa_score_numpy(y_true, y_pred, weights="quadratic")

print(f"unweighted kappa = {k_unw:.3f}")
print(f"linear kappa     = {k_lin:.3f}")
print(f"quadratic kappa  = {k_quad:.3f}")

cm, labels_used = confusion_matrix_numpy(y_true, y_pred, labels=labels)
plot_confusion_matrix(cm, labels_used, title="Ordinal example: confusion matrix")
unweighted kappa = 0.601
linear kappa     = 0.767
quadratic kappa  = 0.879
w_lin = kappa_weight_matrix(k, "linear")
w_quad = kappa_weight_matrix(k, "quadratic")

fig = make_subplots(rows=1, cols=2, subplot_titles=["Linear weights", "Quadratic weights"])

fig.add_trace(
    go.Heatmap(z=w_lin, x=labels.astype(str), y=labels.astype(str), colorscale="Reds", showscale=False),
    row=1,
    col=1,
)
fig.add_trace(
    go.Heatmap(z=w_quad, x=labels.astype(str), y=labels.astype(str), colorscale="Reds", showscale=False),
    row=1,
    col=2,
)

fig.update_yaxes(autorange="reversed")
fig.update_layout(title="Disagreement weights (larger = more severe)")
fig

6) Using \(\kappa\) for optimization (threshold tuning)#

\(\kappa\) is computed from discrete predictions, so it is not differentiable with respect to model parameters. That means you usually don’t train a model by running gradient descent on \(\kappa\) directly.

A common pattern is:

  1. Train a probabilistic model with a differentiable objective (e.g., log loss)

  2. Pick a decision threshold on a validation set to maximize \(\kappa\)

Below: logistic regression trained from scratch with gradient descent, then a threshold sweep to maximize \(\kappa\).

def sigmoid(z: np.ndarray) -> np.ndarray:
    z = np.asarray(z)
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out


def standardize_fit(X: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)
    return (X - mean) / std, mean, std


def standardize_transform(X: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    return (X - mean) / std


def add_intercept(X: np.ndarray) -> np.ndarray:
    return np.c_[np.ones((X.shape[0], 1)), X]


def log_loss_from_logits(y: np.ndarray, logits: np.ndarray) -> float:
    # Stable binary cross-entropy: mean(log(1+exp(z)) - y*z)
    return float(np.mean(np.logaddexp(0.0, logits) - y * logits))


def fit_logistic_regression_gd(
    X: np.ndarray,
    y: np.ndarray,
    lr: float = 0.2,
    n_iters: int = 2_000,
    l2: float = 0.0,
    record_every: int = 20,
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Return (weights, steps, losses). L2 excludes the intercept term."""
    X_i = add_intercept(X)
    y = y.astype(float)
    w = np.zeros(X_i.shape[1], dtype=float)

    steps: list[int] = []
    losses: list[float] = []

    for step in range(1, n_iters + 1):
        logits = X_i @ w
        p = sigmoid(logits)

        grad = (X_i.T @ (p - y)) / y.size
        grad[1:] += l2 * w[1:]
        w -= lr * grad

        if step % record_every == 0 or step == 1:
            steps.append(step)
            losses.append(log_loss_from_logits(y, logits) + 0.5 * l2 * float(w[1:] @ w[1:]))

    return w, np.array(steps), np.array(losses)


def predict_proba_logreg(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    X_i = add_intercept(X)
    return sigmoid(X_i @ w)
# Synthetic, slightly imbalanced binary classification problem
n0, n1 = 2_000, 500
mean0 = np.array([-1.2, -0.2])
mean1 = np.array([1.0, 0.9])
cov = np.array([[1.0, 0.45], [0.45, 1.0]])

X0 = rng.multivariate_normal(mean0, cov, size=n0)
X1 = rng.multivariate_normal(mean1, cov, size=n1)
X = np.vstack([X0, X1])
y = np.r_[np.zeros(n0, dtype=int), np.ones(n1, dtype=int)]

X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=42
)

# Standardize based on train only
X_train_s, mean, std = standardize_fit(X_train)
X_val_s = standardize_transform(X_val, mean, std)
X_test_s = standardize_transform(X_test, mean, std)

w, steps, losses = fit_logistic_regression_gd(X_train_s, y_train, lr=0.2, n_iters=2_000, l2=0.1)

fig = px.line(x=steps, y=losses, title="Logistic regression (GD): training objective")
fig.update_layout(xaxis_title="iteration", yaxis_title="log loss + L2")
fig
def sweep_thresholds(y_true: np.ndarray, proba: np.ndarray, thresholds: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    kappas = np.empty_like(thresholds, dtype=float)
    accs = np.empty_like(thresholds, dtype=float)
    for i, t in enumerate(thresholds):
        y_hat = (proba >= t).astype(int)
        kappas[i] = cohen_kappa_score_numpy(y_true, y_hat)
        accs[i] = float((y_true == y_hat).mean())
    return kappas, accs


proba_val = predict_proba_logreg(X_val_s, w)
thresholds = np.linspace(0.01, 0.99, 99)
kappas, accs = sweep_thresholds(y_val, proba_val, thresholds)

best_idx = int(np.argmax(kappas))
best_t = float(thresholds[best_idx])

print(f"best threshold (val) = {best_t:.2f}")
print(f"best kappa (val)     = {kappas[best_idx]:.3f}")
print(f"accuracy @ best_t    = {accs[best_idx]:.3f}")

fig = go.Figure()
fig.add_trace(go.Scatter(x=thresholds, y=kappas, mode="lines", name="kappa"))
fig.add_trace(go.Scatter(x=thresholds, y=accs, mode="lines", name="accuracy", opacity=0.7))
fig.add_vline(x=best_t, line_dash="dash", line_color="black")
fig.update_layout(
    title="Validation sweep: kappa and accuracy vs threshold",
    xaxis_title="threshold",
    yaxis_title="metric value",
)
fig
best threshold (val) = 0.35
best kappa (val)     = 0.764
accuracy @ best_t    = 0.926
def eval_at_threshold(y_true: np.ndarray, proba: np.ndarray, t: float) -> dict[str, float]:
    y_hat = (proba >= t).astype(int)
    cm, _ = confusion_matrix_numpy(y_true, y_hat, labels=np.array([0, 1]))
    return {
        "threshold": float(t),
        "accuracy": float((y_true == y_hat).mean()),
        "kappa": cohen_kappa_score_numpy(y_true, y_hat),
        "tn": float(cm[0, 0]),
        "fp": float(cm[0, 1]),
        "fn": float(cm[1, 0]),
        "tp": float(cm[1, 1]),
    }


proba_test = predict_proba_logreg(X_test_s, w)

m_default = eval_at_threshold(y_test, proba_test, t=0.5)
m_best = eval_at_threshold(y_test, proba_test, t=best_t)

print("test @ t=0.50:", m_default)
print("test @ best_t:", m_best)

cm_default, _ = confusion_matrix_numpy(y_test, (proba_test >= 0.5).astype(int), labels=np.array([0, 1]))
cm_best, _ = confusion_matrix_numpy(y_test, (proba_test >= best_t).astype(int), labels=np.array([0, 1]))

fig = make_subplots(rows=1, cols=2, subplot_titles=["t = 0.50", f"t = {best_t:.2f}"])
fig.add_trace(go.Heatmap(z=cm_default, x=["0", "1"], y=["0", "1"], colorscale="Blues", showscale=False), row=1, col=1)
fig.add_trace(go.Heatmap(z=cm_best, x=["0", "1"], y=["0", "1"], colorscale="Blues", showscale=False), row=1, col=2)
fig.update_yaxes(autorange="reversed")
fig.update_layout(title="Test confusion matrices: default vs kappa-optimized threshold", xaxis_title="pred", yaxis_title="true")
fig
test @ t=0.50: {'threshold': 0.5, 'accuracy': 0.88, 'kappa': 0.5161290322580645, 'tn': 400.0, 'fp': 0.0, 'fn': 60.0, 'tp': 40.0}
test @ best_t: {'threshold': 0.35000000000000003, 'accuracy': 0.9, 'kappa': 0.6727748691099477, 'tn': 381.0, 'fp': 19.0, 'fn': 31.0, 'tp': 69.0}

7) Pros, cons, and good use cases#

Pros

  • Chance-corrected agreement: unlike accuracy, it explicitly subtracts expected agreement from the marginals

  • Works for binary and multiclass classification

  • Weighted variants handle ordinal labels naturally

Cons / caveats

  • Sensitive to class prevalence and prediction bias (the “prevalence paradox”): the same accuracy can yield different \(\kappa\)

  • Not defined for probabilistic outputs; you must pick a threshold (or argmax) first

  • Not differentiable → usually not a direct training loss (use it for model selection / threshold tuning)

  • Can be unstable on small datasets (marginals estimated with high variance)
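To illustrate the small-sample caveat, here is a quick (entirely hypothetical) bootstrap experiment: the same ~80%-agreement labeler pair yields a much noisier \(\kappa\) estimate at \(n=50\) than at \(n=5000\).

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

boot_rng = np.random.default_rng(0)

def bootstrap_kappa_std(n: int, n_boot: int = 500) -> float:
    # One dataset of size n with ~80% raw agreement, then the bootstrap spread of kappa
    y = boot_rng.integers(0, 2, size=n)
    y_hat = np.where(boot_rng.random(n) < 0.8, y, 1 - y)
    idx = boot_rng.integers(0, n, size=(n_boot, n))
    return float(np.std([cohen_kappa_score(y[b], y_hat[b]) for b in idx]))

s_small, s_large = bootstrap_kappa_std(50), bootstrap_kappa_std(5_000)
print(f"n=50:   kappa std ≈ {s_small:.3f}")
print(f"n=5000: kappa std ≈ {s_large:.3f}")
```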

Good use cases

  • Inter-annotator reliability (two humans labeling the same items)

  • Evaluating classifiers under class imbalance, when you want credit only for performance beyond what the class prevalences alone would produce

  • Ordinal classification (use weighted \(\kappa\))

Exercises#

  1. Construct a dataset where accuracy increases but \(\kappa\) decreases (hint: shift the marginals).

  2. For the threshold-sweep demo, optimize for accuracy instead of \(\kappa\). How do the chosen thresholds differ?

  3. Create your own custom weight matrix for ordinal labels (e.g., asymmetric penalties) and compute \(\kappa_w\).

References#

  • scikit-learn API: sklearn.metrics.cohen_kappa_score

  • Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

  • Weighted kappa: commonly used for ordinal agreement (e.g., clinical ratings).